Deep multi-scale video prediction beyond mean square error

Authors

  • Michaël Mathieu
  • Camille Couprie
  • Yann LeCun

Abstract

Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction is viewed as a promising avenue for unsupervised feature learning. In this work, we train a convolutional network to generate future frames given an input sequence. To deal with the inherently blurry predictions obtained from the standard Mean Squared Error (MSE) loss function, we propose three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function. We compare our predictions to different published results based on recurrent neural networks on the UCF101 dataset.
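
The abstract names an image gradient difference loss without defining it. As a rough illustration only, the sketch below assumes one plausible formulation: for every pair of vertically or horizontally adjacent pixels, compare the absolute intensity difference in the target frame with the one in the predicted frame, and sum the discrepancies raised to a power `alpha`. The function name, the `alpha` parameter, and the toy one-row images are hypothetical and are not taken from the text above.

```python
def gradient_difference_loss(target, predicted, alpha=1.0):
    """Assumed formulation: sum of | |dT| - |dP| | ** alpha over neighbor pairs.

    target and predicted are 2D lists of floats with identical shapes.
    dT and dP are differences between adjacent pixels (vertical and horizontal).
    """
    rows, cols = len(target), len(target[0])
    if len(predicted) != rows or any(len(row) != cols for row in predicted):
        raise ValueError("target and predicted must have the same shape")

    loss = 0.0
    for i in range(rows):
        for j in range(cols):
            if i > 0:  # vertical neighbor pair (i, j) and (i - 1, j)
                d_target = abs(target[i][j] - target[i - 1][j])
                d_pred = abs(predicted[i][j] - predicted[i - 1][j])
                loss += abs(d_target - d_pred) ** alpha
            if j > 0:  # horizontal neighbor pair (i, j) and (i, j - 1)
                d_target = abs(target[i][j] - target[i][j - 1])
                d_pred = abs(predicted[i][j] - predicted[i][j - 1])
                loss += abs(d_target - d_pred) ** alpha
    return loss


# Toy data (hypothetical): a sharp edge, a blurred edge, and a brightness-shifted edge.
sharp = [[0.0, 0.0, 1.0, 1.0]]
blurry = [[0.0, 0.25, 0.75, 1.0]]
offset = [[0.5, 0.5, 1.5, 1.5]]

print(gradient_difference_loss(sharp, blurry))  # 1.0: the smeared edge is penalized
print(gradient_difference_loss(sharp, offset))  # 0.0: edges match, brightness is ignored
```

Under this assumed definition, a uniformly shifted prediction scores zero, so a term like this only makes sense alongside an intensity loss such as MSE rather than as a replacement for it.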

Similar resources

Multi Resolution LSTM For Long Term Prediction In Neural Activity Video

Epileptic seizures are caused by abnormal, overly synchronized, electrical activity in the brain. The abnormal electrical activity manifests as waves, propagating across the brain. Accurate prediction of the propagation velocity and direction of these waves could enable realtime responsive brain stimulation to suppress or prevent the seizures entirely. However, this problem is very challenging ...

Prediction of ultimate strength of shale using artificial neural network

A rock failure criterion is very important for prediction of the ultimate strength in rock mechanics and geotechnics; it is determined for rock mechanics studies in mining, civil, and oil wellbore drilling operations. Shales are also among the most difficult formations to treat. Therefore, in this research work, using the artificial neural network (ANN), a model was built to predict the ultimat...

Instance Similarity Deep Hashing for Multi-Label Image Retrieval

Hash coding has been widely used in the approximate nearest neighbor search for large-scale image retrieval. Recently, many deep hashing methods have been proposed and shown largely improved performance over traditional feature-learning-based methods. Most of these methods examine the pairwise similarity on the semantic-level labels, where the pairwise similarity is generally defined in a hard-a...

BUL in MediaEval 2016 Emotional Impact of Movies Task

This paper describes our working approach for the Emotional Impact of Movies task of MediaEval 2016. There are 2 sub-tasks set to make affective predictions, based on Arousal and Valence values, on video clips. Sub-task 1 requires global emotion prediction. Here a framework is developed using Deep Auto-Encoders, a feature variation algorithm and a Deep network. For sub-task 2, a set of audio fe...

Fixed and Adaptive Predictors for Hybrid Predictive/Transform Coding

Hybrid predictive/transform coding is studied. The usual formulation is to first apply a unitary transform and then code the transform coefficients with independent DPCM coders, i.e., the prediction is performed in the transform domain. This structure is compared to spatial domain prediction, where a difference signal is formed in the spatial domain and then coded by a transform coder. A linea...

Journal title:
  • CoRR

Volume: abs/1511.05440

Publication date: 2015